diagnosis, start of therapy, end of therapy, start of improvement or remission, date of relapse, or
others. For events, record the date of each occurrence if the event can recur. Even if death is not the
event of interest, record the date of death if it is available. For censoring purposes, make sure
that you collect dates of contact so you can identify a last-seen date if needed. If you
collect your data properly, you will later be able to calculate any time interval you need, as well as
create any event status indicator you need.
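As a minimal sketch of how recorded dates turn into analysis variables, the following Python snippet derives a follow-up time and an event status indicator for each participant. The field names (diagnosis, death, last_contact) and all of the dates are invented for illustration; your own data file will have its own layout.

```python
from datetime import date

# Hypothetical participant records; field names and dates are invented for illustration.
participants = [
    {"id": 1, "diagnosis": date(2020, 1, 15), "death": date(2021, 3, 1),
     "last_contact": date(2021, 2, 10)},
    {"id": 2, "diagnosis": date(2020, 6, 1), "death": None,
     "last_contact": date(2022, 6, 1)},
]

def follow_up(record):
    """Return (days_observed, event), where event is 1 for death and 0 for censored."""
    if record["death"] is not None:
        # The event occurred: observation ends at the date of death.
        end, event = record["death"], 1
    else:
        # No event recorded: censor at the last-seen date.
        end, event = record["last_contact"], 0
    return (end - record["diagnosis"]).days, event

results = [follow_up(p) for p in participants]
for p, (days, event) in zip(participants, results):
    print(p["id"], days, event)
```

Because the raw dates are kept, the same records could later be re-cut with a different starting point (start of therapy, for example) without going back to the source documents.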
Dates and times should be recorded to suitable precision. If your study timeline spans years, it’s best to
keep track of dates to the day. In a Phase I clinical trial (see Chapter 5), participants may be studied
for events that happen in a span of a few days. In those cases, it’s important to record dates and times
to the nearest hour or minute. You can even envision laboratory studies of intracellular events where
time would have to be recorded with millisecond — or even microsecond — precision!
Different statistical software packages (as well as Microsoft Excel) store dates and times in
different ways. Designating columns as date format or time format lets you perform calendar
arithmetic, so you can obtain time intervals by subtracting one date from another.
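Here is a small illustration of calendar arithmetic using Python's standard datetime module. The variable names and the dates are made up; the point is only that subtracting properly typed dates gives you the interval directly, at whatever precision you recorded.

```python
from datetime import date, datetime

# Day-level precision: subtracting two dates gives an interval in whole days.
start_of_therapy = date(2023, 3, 1)   # invented example date
relapse = date(2024, 3, 1)            # invented example date
days_to_relapse = (relapse - start_of_therapy).days

# Minute-level precision: subtracting two datetimes keeps the hours and minutes.
dose_given = datetime(2024, 5, 6, 8, 30)     # invented example timestamp
event_onset = datetime(2024, 5, 7, 14, 45)   # invented example timestamp
hours_to_event = (event_onset - dose_given).total_seconds() / 3600

print(days_to_relapse, hours_to_event)
```

Note that the first interval accounts for the leap day in February 2024 automatically, which is exactly the kind of detail that is easy to get wrong when dates are stored as plain text.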
Miscoding censoring information
It can be surprisingly easy to miscode the event status indicator. Suppose the variable is named Death
and is coded as 1 if the participant died during the observation period and 0 if they were censored. This
seems intuitive. But analysts may want to identify all the censored observations in their data, so they
may create a censored indicator named Censored, and code it as 1 if the participant is censored, and 0
if they are not. Because data may be used for different types of survival analyses, there could be other
event indicators included in the data as well, also coded as 1 and 0.
The problem is that if you accidentally use your censored indicator instead of your event indicator
when running your survival analysis, you will unknowingly flip your analysis: every censored participant
is treated as having had the event, and every participant who had the event is treated as censored. You
won’t get any warning or error message from the program. You’ll only get incorrect results. Worse,
depending on how many censored and uncensored observations you have, the survival curve may give no
hint of an error. It may look like a perfectly reasonable survival curve for your data, even though it’s
completely wrong.
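To see how quietly this mistake slips through, consider the following sketch. It uses a bare-bones product-limit (Kaplan-Meier) calculation written from scratch for illustration, not any particular package's routine, and the eight follow-up times and statuses are invented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time (product-limit estimate)."""
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Participants still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        # Events (indicator == 1) occurring exactly at time t.
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if d > 0:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
    return curve

# Invented follow-up times (months) and statuses for eight participants.
times = [2, 3, 5, 7, 8, 10, 12, 12]
death = [1, 0, 1, 1, 0, 1, 1, 0]       # 1 = died, 0 = censored
censored = [1 - d for d in death]      # 1 = censored, 0 = died

correct = kaplan_meier(times, death)      # the intended analysis
flipped = kaplan_meier(times, censored)   # wrong column, but no error is raised

print(correct)
print(flipped)
```

Both calls run without complaint and both return a step function that starts near 1 and declines over time. Nothing in the output itself tells you that the second curve describes time to censoring rather than time to death.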
You have to read your software’s documentation carefully to make sure you code your event
variable correctly. Also, you should always check the program’s output for the number of
censored and uncensored observations and compare them to the known count of censored and
uncensored participants in your data file.
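One way to make that comparison routine is to check the counts in code before you run the analysis. The helper below is a hypothetical sketch (the function name and its arguments are not part of any statistical package) that assumes the event indicator is coded 1 for the event and 0 for censored.

```python
from collections import Counter

def check_event_coding(event_column, expected_events, expected_censored):
    """Compare the 1/0 counts in an indicator column with the known counts.

    Assumes 1 = event occurred and 0 = censored. Raises ValueError on a mismatch.
    """
    counts = Counter(event_column)
    unexpected = set(counts) - {0, 1}
    if unexpected:
        raise ValueError(f"Unexpected codes in indicator: {sorted(unexpected)}")
    if counts[1] != expected_events or counts[0] != expected_censored:
        raise ValueError(
            f"Found {counts[1]} events and {counts[0]} censored; "
            f"expected {expected_events} and {expected_censored}"
        )
    return counts[1], counts[0]

# Invented example: 3 known deaths and 2 known censored participants.
death = [1, 0, 1, 1, 0]
print(check_event_coding(death, expected_events=3, expected_censored=2))
```

If you pass the Censored column by mistake, the counts come out reversed and the check fails. Keep in mind that this check cannot catch the mistake when the numbers of censored and uncensored participants happen to be equal, so it supplements reading the documentation rather than replacing it.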